Abstract

Let G(A, B) denote the 2-qubit gate which acts as the 1-qubit SU(2) gates A and B
in the even and odd parity subspaces respectively, of two qubits. Using a Clifford algebra
formalism we show that arbitrary uniform families of circuits of these gates, restricted to
act only on nearest neighbour (n.n.) qubit lines, can be classically efficiently simulated.
This reproduces a result originally proved by Valiant using his matchgate formalism,
and subsequently related by others to free fermionic physics. We further show that if
the n.n. condition is slightly relaxed, to allowing the same gates to act only on n.n.
and next-n.n. qubit lines, then the resulting circuits can efficiently perform universal
quantum computation. From this point of view, the gap between efficient classical and
quantum computational power is bridged by a very modest use of a seemingly innocuous
resource (qubit swapping). We also extend the simulation result above in various ways.
In particular, by exploiting properties of Clifford operations in conjunction with the
Jordan-Wigner representation of a Clifford algebra, we show how one may generalise
the simulation result above to provide further classes of classically efficiently simulatable
quantum circuits, which we call Gaussian quantum circuits.

Keywords: quantum circuits, quantum computational complexity, classical simulation,
Clifford algebras, matchgates.
1 Introduction

Quantum computation is widely regarded as being more powerful than classical computa- 
tion. Indeed in some scenarios there are provable benefits, such as an exponential reduction 
in communication resources for some distributed computing tasks (e.g. Raz 1999) and 
in quantum cryptography, the ability to communicate with unconditional security against 
eavesdropping. These results depend neither on any computational hardness assumptions 
nor on the presence of any oracle relativisations. Furthermore in suitably relativised oracle 
models of computation there are various known exponential savings in quantum versus clas- 
sical query complexity such as Deutsch and Jozsa 1992 and Simon 1997. However for pure 
(unrelativised) computation there is to date no proof of separation and it is still possible 
that efficient classical and quantum computational power might coincide i.e. the complexity 
classes BPP and BQP might be equal. (See for example Nielsen and Chuang 2000 for a
definition of these classes. Here and below the term "efficient" is synonymous with "polynomial
time".) Note that this is not the same question as the issue of efficient classical simulation
of quantum processes since any BQP algorithm is a quantum process of only a severely 
restricted kind, required to satisfy infinitely many constraints relative to an infinite set of 
input states viz. all computational basis states. In this paper we will study a representa- 
tion of quantum computation in which the gap (if it exists) between efficient classical and 
efficient quantum computation appears to be surprisingly fragile, being provably bridged 
by a very modest use of a seemingly trivial resource (cf theorems 1 and 2 below). 

We will provide a self contained development of a class of quantum circuits based on 
so-called matchgates, a notion that was introduced in Valiant 2002. Our approach is closely 
related to work of Knill 2001, Terhal and DiVincenzo 2002 and DiVincenzo and Terhal 
2005, relating matchgate circuits to free fermionic quantum computation (and later further 
extended in Bravyi 2005, 2008). Here we will emphasise the underlying mathematical ingre- 
dients and consider some further properties and generalisations that go beyond the fermionic 
formalism. Indeed the existence of a physical interpretation in terms of fermionic physics, 
although interesting, appears to be entirely fortuitous and of no particular consequence for 
our considerations of computational complexity issues per se. 

Consider 2-qubit gates G(A, B) of the form (in the computational basis):

$$
G(A,B) = \begin{pmatrix} p & 0 & 0 & q \\ 0 & w & x & 0 \\ 0 & y & z & 0 \\ r & 0 & 0 & s \end{pmatrix},
\qquad A = \begin{pmatrix} p & q \\ r & s \end{pmatrix}, \quad B = \begin{pmatrix} w & x \\ y & z \end{pmatrix}, \tag{1}
$$

where A and B are both in SU(2) or both in U(2) with the same determinant. Thus the
action of G(A, B) amounts to A acting in the even parity subspace (spanned by $|00\rangle$ and
$|11\rangle$) and B acting in the odd parity subspace (spanned by $|01\rangle$ and $|10\rangle$). Occasionally we
will wish to consider 2-qubit gates of the form eq. (1) but having $\det A \neq \det B$. In this
case the gate will be denoted $\tilde{G}(A, B)$. To emphasise this distinction we sometimes refer to
a gate with det A = det B as an allowable G(A, B) gate. We will denote the Pauli operators
by X, Y, and Z.
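
As a concrete illustration of this definition, the following minimal Python sketch builds the matrix of G(A, B) in the basis |00>, |01>, |10>, |11> and tests the determinant condition. The helper names `G`, `det2` and `is_allowable` are illustrative choices, not notation from the paper.

```python
# Sketch: G(A, B) in the computational basis |00>, |01>, |10>, |11>.
# A acts on span{|00>, |11>} (even parity), B on span{|01>, |10>} (odd parity).

def G(A, B):
    """Return the 4x4 matrix of G(A, B) as a list of rows."""
    (p, q), (r, s) = A
    (w, x), (y, z) = B
    return [
        [p, 0, 0, q],
        [0, w, x, 0],
        [0, y, z, 0],
        [r, 0, 0, s],
    ]

def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def is_allowable(A, B, tol=1e-12):
    """True when det A = det B, i.e. G(A, B) is an allowable gate."""
    return abs(det2(A) - det2(B)) < tol

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
CZ = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
```

With these conventions `G(I2, X)` is the SWAP matrix and `G(Z, I2)` is the CZ matrix; neither passes `is_allowable`, whereas `G(Z, X)` does.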



Theorem 1 Consider any uniform (hence poly-sized) quantum circuit family comprising 
only G(A, B) gates such that: 

(i) the G(A,B) gates act on nearest neighbour (n.n.) lines only; 

(ii) the input state is any product state; 

(iii) the output is a final measurement in the computational basis on any single line.
Then the output may be classically efficiently simulated. More precisely for any k we can
classically efficiently compute the expectation value $\langle Z_k\rangle_{\rm out} = \langle\psi_{\rm out}| Z_k |\psi_{\rm out}\rangle = p_0 - p_1$
where $Z_k$ is the Pauli Z operator on the k-th line, $|\psi_{\rm out}\rangle$ is the final state and $p_0, p_1$ are the
outcome probabilities.



Theorem 1 is very similar to the classical simulation result of Valiant 2002 and Ter- 
hal and DiVincenzo 2002. Our result is more general in the feature of allowing arbitrary 
product state inputs (rather than just computational basis states). It is more restrictive in 
considering only single bit outputs (rather than individual probabilities of computational 
basis measurements across many lines) and in not encompassing the adaptive circuits that
are included in Terhal and DiVincenzo 2002. 

The notion of efficient classical simulation that we will use in this paper is the following. 
Let $C_n$ be any uniform family of quantum circuits together with (i) a specified class of input
states (usually taken to be product states, but we may also restrict to just computational
basis states) and (ii) a specified class of output measurements (which we take to be a
Z-basis measurement on any single line). We say that $C_n$ is classically efficiently simulatable
(relative to (i) and (ii)) if the probabilities of measurement outcomes can be computed by
classical means to m digits of accuracy in poly(n, m) time.

Note that this ability to efficiently compute the probabilities to exponential accuracy is
a rather strong notion, e.g. we might instead adopt weaker criteria such as the ability
to compute the probabilities to accuracy 1/poly(m), i.e. O(log m) digits, in poly(n, m)
time, or the ability to sample the output probability distribution once (by classical efficient
probabilistic means, to suitable accuracy). Indeed the last requirement would suffice in
issues of the comparison of quantum to classical computational power but we will in fact 
achieve the strongest notion above in our results and we thus adopt it as our definition. We 
make further comments about implications of this strong notion of classical simulation in 
the concluding section 7 below. 

Note that if the n.n. G(A, B) circuits in theorem 1 are considered with computational 
basis inputs and also required to satisfy the BQP bounded probability conditions (viz. that 
the output probabilities are always > 2/3 or < 1/3), then the ability to classically calculate the
output probabilities (rather than the ability merely to sample the output distribution once) 
implies that the corresponding decision problem is not just in BPP but actually in P i.e. 
deterministic classical polynomial time. 

In the following sections we will first prove a universality result for G(A, B) gates, if 
these gates are also allowed to act on next-n.n. lines in addition to the n.n. lines of 
theorem 1. Then we give some background on the origin of the notion of matchgates, 
which first led to the consideration of circuits of G(A, B) gates in Valiant 2002. Next
we consider a formalism of anti-commuting variables that form a Clifford algebra, leading 
to a proof of theorem 1. This approach is essentially the one given in Knill 2001 and 
Terhal and DiVincenzo 2002, but we give some more transparent proofs and we develop 
further properties. First, we elucidate the n.n. condition (i) in theorem 1: we show that 
the general class of simulatable gates comprises a uniformly describable family that may 
act on any number of the n qubits, in which the G(A, B) gates appear as the subset of 
n.n. 2-qubit gates. We show furthermore that all gates in the family can be obtained as 
circuits of n.n. G(A, B) gates (hence adding nothing new) and that non-n.n. G(A, B) gates 
are not generally in the family. Second, by considering Clifford operations¹ (i.e. n-qubit
unitary operations that normalise the n-qubit Pauli group in $U(2^n)$) in conjunction with the
Clifford algebra formalism, we will describe an avenue for generalising theorem 1 and give
some examples of simulatable circuits which cannot be obtained as circuits of n.n. G(A, B)
gates only. Since all our classes of classically simulatable circuits comprise gates that are
generated by Hamiltonians expressible as quadratic elements of a Clifford algebra, we call
them Gaussian quantum circuits.

¹ The appellation "Clifford" here, commonly used in quantum computation literature, appears not to be
mathematically related to the well established notion of Clifford algebra in mathematics generally.



2 Universality of n.n. and next-n.n. G(A, B) gates 

The n.n. condition in theorem 1(i) is perhaps a surprising ingredient but it is crucial: it
was already mentioned in Terhal and DiVincenzo 2002 (based on a result in Kempe et al. 
2001) that n.n. G(A, B) gates together with the swap gate SWAP, or equivalently G(A, B) 
gates acting on arbitrary pairs of qubit lines, can perform universal quantum computation. 
We will prove a stronger result: 

Theorem 2 Let $C_n$ be any uniform family of quantum circuits with output given by a Z
basis measurement on the first line. Then $C_n$ may be simulated by a circuit of G(A, B)
gates acting on n.n. or next-n.n. lines only (i.e. on line pairs at most distance 2 apart)
with at most a constant-factor increase in the size of the circuit.

Before the proof we make a few remarks. As an immediate corollary we have that any 
BQP algorithm can be simulated by a poly-sized circuit of G(A, B) gates acting only on 
n.n. and next-n.n. lines. This fact together with theorem 1 shows that a very limited use 
of the seemingly innocuous operation SWAP (on n.n. lines) allowing n.n. G(A, B) gates to 
act on lines just one further apart, suffices to bridge the gap between classical and quantum 
efficient computational power. The result becomes perhaps even more striking if we note 
that SWAP itself is very close to being expressible in the allowed G(A, B) form. Indeed
SWAP = G(I, X) and fails only through a mere minus sign in det X = −det I. Thus if we
drop the det A = det B condition in eq. (1), then the resulting G(A, B) gates acting on n.n.
lines become efficiently universal for quantum computation.

The significance of SWAP (or equivalently the ability of 2-qubit gates to act on distant 
lines) for quantum computational power appears also in a different context. Using the 
formalism of tensor network contractions it may be shown (Markov and Shi 2008, Jozsa 
2006 and Yoran and Short 2006) that any poly-sized quantum circuit of 1- and 2-qubit
gates, which has log depth and in which the 2-qubit gates are restricted to act at bounded
range (i.e. on line pairs at most distance c apart, for some constant c) may be classically
efficiently simulated. It is also known (Cleve and Watrous 2000) that Shor's quantum
factoring algorithm (Shor 1997) can be implemented as a log depth circuit but its 2-qubit
gates act on distant lines, O(n) apart, which is not bounded with increasing input size
n. Thus from this point of view, the quantum advantage of the algorithm (over classical
algorithms) rests entirely on the presence of unboundedly distant actions (or unbounded use
of SWAP). Also it is shown in Terhal and DiVincenzo 2004 and Jozsa 2006 that all depth
2 circuits (followed by a measurement) are classically efficiently simulatable even if 2-qubit 
gates act on arbitrary line pairs while the same simulation result for depth 3 circuits (with a 
suitably strong notion of classical simulation) would imply equality of BPP and BQP. Here 
again the feature of unboundedly distant action is essential, whereas our result in theorems 
1 and 2 achieves full efficient quantum computational power by passage from distance one 
to just bounded distance two. 

Proof of theorem 2: Given any uniform quantum circuit family we may assume w.l.o.g.
that it comprises n.n. controlled-Z gates (n.n. CZ) and 1-qubit gates generically denoted
as A.

Figure 1: Encoded universality by n.n. and next-n.n. G(A, B) gates. The logical single-qubit
unitary gate A and the logical two-qubit CZ gate are illustrated in (a) and (b),
respectively.

We start with a quadrupled number of qubit lines and encode the original input $|0\rangle$'s
and $|1\rangle$'s as logical basis states $|0_L\rangle = |0000\rangle$ and $|1_L\rangle = |1001\rangle$ respectively in consecutive
blocks of 4 lines each (cf the remark after the proof). Then with suitably encoded gate
operations the whole computation will stay within tensor products of span$\{|0_L\rangle, |1_L\rangle\}$ of
each quadruple of qubits.

On any such quadruple of lines, say 1234, we can perform the encoded 1-qubit gate A 
as the following sequence of allowed n.n. gates (depicted in figure 1 (a)): 

$$
G(Z,X)_{12}\, G(Z,X)_{34}\, G(A,A)_{23}\, G(Z,X)_{12}\, G(Z,X)_{34} \tag{2}
$$

(where the subscripts denote the line numbers). To see this, note that

$$
G(Z,X) = G(Z,I)\,G(I,X) = (CZ)(SWAP),
$$

so that $G(Z,X)_{12}\, G(Z,X)_{34}$ can be thought of (for our logical basis states) as just swapping
lines 1 and 4 into positions 2 and 3. In view of the form of the encoding $|0_L\rangle$ and $|1_L\rangle$,
the logical qubit is then encoded in the $\{|00\rangle, |11\rangle\}$ subspace of lines 2 and 3 so G(A, A)
will apply the 1-qubit gate A to it. Finally the lines are swapped back to their original 
positions, restoring the encoding. 

To perform an encoded CZ on two consecutive quadruples, say 1234 and 5678, we simply
apply $CZ_{45}$, i.e. CZ on the "crossover" pair of lines 4 and 5. Indeed for any pair of basis
states $|abcd\rangle_{1234}$ and $|efgh\rangle_{5678}$ we'll get a minus sign iff d = e = 1, giving the correct action
on any encoded $|x_L\rangle$ and $|y_L\rangle$. Next note that since composition of G(A, B) gates amounts
to multiplying the A's and B's separately we obtain (with all gates acting on lines 4 and 5)

$$
CZ_{45} = G(Z,I) = G(H,H)\,G(X,I)\,G(H,H) = G(H,H)\,G(X,X)\,G(I,X)\,G(H,H) \tag{3}
$$

where $H = \frac{1}{\sqrt{2}}(X + Z)$ is the Hadamard operator. In the last expression, G(I, X) is SWAP
and all other gates are allowable G(A, B) gates. This implementation of $CZ_{45}$ is depicted
in figure 1 (b).
Finally note that in an arbitrary such circuit, SWAP is used only on "crossover" pairs
(4,5), (8,9) etc. of the encoding quadruples (1,2,3,4), (5,6,7,8), (9,10,11,12) etc. Hence no 
line is ever moved more than one position distant from its original location, by overall action 
of any number of such SWAPs. We may commute all these SWAP operations out to the 
output end of the circuit. In so doing, any line of each (originally n.n.) G(A,B) gate in 
eq. (2) may be moved by at most one place but we can never move both lines of any n.n. 
G(A, B) gate in view of the block size 4 of the encoding. Thus the resulting circuit (with 
SWAP's eliminated) comprises only G(A, B) gates acting on n.n. or next-n.n. lines, as 
required. 

In this process SWAP gates need to be commuted across G(X, X) and G(H, H) gates 
(cf eq. (3)). But for any G(A, B) we have (using SWAP = G(I, X)) that

G(A,B) SWAP = SWAP (SWAP G(A, B) SWAP) = SWAP G(A,XBX) 

and the last gate is an allowed G(A, B) gate. Since the whole computation is engineered to 
represent the original given circuit, recoded in the $\{|0_L\rangle = |0000\rangle, |1_L\rangle = |1001\rangle\}$ subspaces
of consecutive line quadruples, a final measurement on line 1 will produce the same output 
distribution as a measurement on qubit 1 in the original given circuit. This completes the 
proof of theorem 2. □ 
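
The gate identities used in the proof can be checked numerically. The sketch below is a pure-Python illustration (the helper names `matmul`, `kron`, `on_lines` and the test gates `A`, `B` are assumptions made here, not part of the paper): it verifies eq. (3), the relation G(Z, X) = (CZ)(SWAP), the commutation rule SWAP G(A, B) SWAP = G(A, XBX), and that the sequence of eq. (2) acts as A on the logical states |0000> and |1001>.

```python
import cmath
import math

def matmul(P, Q):
    return [[sum(P[i][t] * Q[t][j] for t in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def kron(P, Q):
    return [[a * b for a in rp for b in rq] for rp in P for rq in Q]

def identity(d):
    return [[1 if i == j else 0 for j in range(d)] for i in range(d)]

def prod(*Ms):
    """Matrix product Ms[0] Ms[1] ... (the rightmost factor acts first)."""
    out = Ms[0]
    for M in Ms[1:]:
        out = matmul(out, M)
    return out

def close(P, Q, tol=1e-12):
    return all(abs(a - b) < tol for rp, rq in zip(P, Q) for a, b in zip(rp, rq))

def G(A, B):
    (p, q), (r, s) = A
    (w, x), (y, z) = B
    return [[p, 0, 0, q], [0, w, x, 0], [0, y, z, 0], [r, 0, 0, s]]

def on_lines(gate, first, n=4):
    """Embed a 2-qubit gate on the n.n. lines (first, first + 1); lines are numbered from 1."""
    return kron(kron(identity(2 ** (first - 1)), gate), identity(2 ** (n - first - 1)))

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
s2 = 1 / math.sqrt(2)
H = [[s2, s2], [s2, -s2]]
SWAP, CZ = G(I2, X), G(Z, I2)

# Arbitrary test gates with det A = det B = 1 (hypothetical values).
A = [[math.cos(0.4), -math.sin(0.4)], [math.sin(0.4), math.cos(0.4)]]
B = [[cmath.exp(0.9j), 0], [0, cmath.exp(-0.9j)]]

# eq. (3): CZ = G(H,H) G(X,X) G(I,X) G(H,H)
eq3_ok = close(prod(G(H, H), G(X, X), SWAP, G(H, H)), CZ)

# G(Z,X) = (CZ)(SWAP)
zx_ok = close(G(Z, X), matmul(CZ, SWAP))

# SWAP G(A,B) SWAP = G(A, XBX)
swap_ok = close(prod(SWAP, G(A, B), SWAP), G(A, prod(X, B, X)))

# eq. (2) on lines 1234, restricted to the logical states |0000> and |1001>
ZX12, ZX34 = on_lines(G(Z, X), 1), on_lines(G(Z, X), 3)
circuit = prod(ZX12, ZX34, on_lines(G(A, A), 2), ZX12, ZX34)
zero_L, one_L = 0b0000, 0b1001
logical = [[circuit[r][c] for c in (zero_L, one_L)] for r in (zero_L, one_L)]
eq2_ok = close(logical, A)
```

Since the 2x2 logical block equals the unitary A, no amplitude leaks out of the code space for this choice of A.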

We remark that instead of the quadruple encoding above, we might have considered the
simpler $|0_L\rangle = |00\rangle$ and $|1_L\rangle = |11\rangle$ as a potentially more natural choice. Indeed in that
case the 1-qubit gate A is applied very simply as G(A, A) (in contrast to eq. (2)) but the
SWAPs from the CZ actions may now move both lines of a n.n. G(A, B) gate in opposite
directions, resulting in G(A, B)'s on lines up to distance 3 (rather than just 2) apart.

3 Perfect matchings and matchgates 

Before beginning our development of theorem 1 we give some brief background remarks on 
the interesting provenance of Valiant's notion of matchgates. (These remarks will not be 
used in any further results). Matchgates arose (Valiant 2002, 2007) in the context of the 
theory of perfect matchings in graphs. For a graph G a perfect matching is a set M of edges 
such that each vertex is the endpoint of exactly one edge in M. It is known that the problem 
of counting the number of perfect matchings in a graph is computationally very hard (being 
complete for the complexity class #P, cf. Papadimitriou 1994) but for planar graphs it is, 
remarkably, computable in polynomial time, using the Fisher-Kasteleyn-Temperley (FKT) 
algorithm (Kasteleyn 1961, Temperley and Fisher 1961 and Jerrum 2003). 

More generally we may consider weighted graphs G in which each edge (ij) is assigned
a weight $w_{ij}$ and introduce the so-called match sum:

$$
\mathrm{PerfM}(G) = \sum_{\text{perfect matchings } M} \; \prod_{(ij) \in M} w_{ij} \tag{4}
$$

Then the FKT algorithm provides a polynomial time computation of the match sum for 
any planar graph. 
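
To make the definition of eq. (4) concrete, here is a brute-force sketch of the match sum. It is not the FKT algorithm: it enumerates matchings recursively and so takes exponential time in general. The function name `match_sum` and the dictionary encoding of the weights are assumptions made for this illustration.

```python
def match_sum(weights, vertices):
    """Sum over perfect matchings of the product of edge weights.

    weights:  dict mapping frozenset({i, j}) -> weight (absent edges count as 0)
    vertices: tuple of vertex labels still to be matched
    """
    if not vertices:
        return 1  # the empty matching contributes the empty product
    v, rest = vertices[0], vertices[1:]
    total = 0
    for idx, u in enumerate(rest):
        w = weights.get(frozenset((v, u)), 0)
        if w:
            # match v with u, then match the remaining vertices among themselves
            total += w * match_sum(weights, rest[:idx] + rest[idx + 1:])
    return total

# 4-cycle 1-2-3-4-1 with weights w12 = 2, w23 = 3, w34 = 5, w14 = 7.
cycle = {
    frozenset((1, 2)): 2,
    frozenset((2, 3)): 3,
    frozenset((3, 4)): 5,
    frozenset((1, 4)): 7,
}
cycle_sum = match_sum(cycle, (1, 2, 3, 4))  # matchings {12, 34} and {23, 14}

# Complete graph K4 with unit weights: the match sum counts perfect matchings.
k4 = {frozenset((i, j)): 1 for i in range(1, 5) for j in range(i + 1, 5)}
k4_count = match_sum(k4, (1, 2, 3, 4))
```

For the 4-cycle the two perfect matchings give 2·5 + 3·7 = 31, and K4 has 3 perfect matchings.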
Next consider a (planar) weighted graph with a designated set $\{v_1, \ldots, v_n\}$ of "input"
vertices and a disjoint designated set $\{v'_1, \ldots, v'_m\}$ of "output" vertices, and consider the
indexed collection (tensor)

$$
M^{i_1 \ldots i_n}_{j_1 \ldots j_m} = \mathrm{PerfM}\!\left(G^{i_1 \ldots i_n}_{j_1 \ldots j_m}\right)
$$

where each index takes values 0 and 1 and $G^{i_1 \ldots i_n}_{j_1 \ldots j_m}$ is the graph obtained from G by deleting
all those input and output vertices (together with their incident edges) that have
corresponding index i or j set to 1. Hence the tensor components are each computable in poly
time by the FKT algorithm.

Matchgates are essentially these tensors with some additional technical modifications 
(Valiant 2002, 2007) whose specification we will omit here. Suffice it to say that the full 
definition is chosen so that a circuit of matchgates, representing contraction of a matchgate 
tensor network, corresponds to the problem of evaluating the match sum of Gtot where Gtot 
(with possibly some residual uncontracted input and output vertices) is the graph obtained 
from the graphs of the individual matchgates by identifying (or "glueing along" ) input and 
output vertices that are contracted in the tensor network. It follows that the components of 
the contraction are computable in poly time too. This is essentially the content of Valiant's 
so-called Holant theorem (Valiant 2007). The expression and clarification of the Holant 
theorem in terms of tensor contractions (and its invariance under appropriate basis changes 
in representing the tensors) was developed in a series of works by Cai and Choudhary 
2006a,b. 

For some choices of graphs and weights (with equal numbers m = n of input and output 
nodes) the matchgate tensors can be unitary i.e. unitary operations on m qubits. Hence 
the above formalism leads to a class of quantum circuits (comprising unitary matchgates) 
that can be classically efficiently simulated. For example it may be shown (Valiant 2002) 
that the unitary gates G(A, B) in eq. (1) arise as matchgates with 2 input and 2 output 
vertices. 

The FKT algorithm (Kasteleyn 1961, Temperley and Fisher 1961 and Jerrum 2003) 
proceeds by setting up a suitable antisymmetric incidence matrix A of the graph's weights 
and then computing PerfM(G) as the Pfaffian of A (which in fact equals $\sqrt{\det A}$ up to sign). Like the
determinant of an arbitrary matrix, the definition of Pfaffian of an antisymmetric matrix is
an expression involving exponentially many terms a priori yet computable in polynomial
time. It is known that Pfaffians also occur in the mathematical formalism of fermionic
quantum physics which suggests that there may be some relationship (or at least some form 
of translation of basic problems) between fermionic physics and perfect matchings in graphs. 
Indeed soon after the appearance of Valiant's work (Valiant 2002) on classical simulation 
of matchgate quantum circuits, Knill 2001 and Terhal and DiVincenzo 2002 provided an 
interpretation of it in terms of fermionic quantum gates and this formalism was subsequently 
further developed by Bravyi 2005, 2008. 
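
The relation between the Pfaffian, the determinant and the match sum can be illustrated on a tiny example. The sketch below evaluates the Pfaffian by its defining first-row expansion, which is exponential in the matrix size and serves only as an illustration (the polynomial-time evaluation mentioned above uses elimination instead). The example graph, its weights and its edge orientation are assumptions chosen here.

```python
def pfaffian(A):
    """Pfaffian of an antisymmetric matrix by expansion along the first row."""
    m = len(A)
    if m == 0:
        return 1
    if m % 2 == 1:
        return 0
    total = 0
    for j in range(1, m):
        if A[0][j] == 0:
            continue
        keep = [k for k in range(m) if k not in (0, j)]
        minor = [[A[r][c] for c in keep] for r in keep]
        total += (-1) ** (j + 1) * A[0][j] * pfaffian(minor)
    return total

def det(A):
    """Determinant by Laplace expansion (adequate for tiny matrices only)."""
    if not A:
        return 1
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

# 4-cycle 1-2-3-4-1 with weights w12 = 2, w23 = 3, w34 = 5, w14 = 7,
# edges oriented 1->2, 2->3, 3->4, 1->4 (entry +w along the orientation, -w against it).
A = [[0, 2, 0, 7],
     [-2, 0, 3, 0],
     [0, -3, 0, 5],
     [-7, 0, -5, 0]]

pf = pfaffian(A)
```

Here the Pfaffian is 2·5 + 7·3 = 31, which is the match sum of the weighted 4-cycle, and its square 961 is the determinant of A.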
4 Clifford algebras, quadratic Hamiltonians and classical simulation



We now return to developing a formalism for treating theorem 1 and some generalisations. 
For n qubit lines, we introduce a set of 2n hermitian operators $c_\mu$ which satisfy the
anti-commutation relations,

$$
\{c_\mu, c_\nu\} \equiv c_\mu c_\nu + c_\nu c_\mu = 2\delta_{\mu\nu} I, \qquad \mu, \nu = 1, \ldots, 2n. \tag{5}
$$

These relations define a Clifford algebra $\mathcal{C}_{2n}$ on 2n generators whose elements are arbitrary
complex linear combinations of products of generators². Since each generator squares to
the identity a general element in the algebra may be expressed as a polynomial of degree
at most 2n,

$$
\sum_{i_1 < \ldots < i_k} \alpha_{i_1 \ldots i_k}\, c_{i_1} \cdots c_{i_k} \tag{6}
$$

(where the index set $\{i_1, \ldots, i_k\}$ may be empty). It follows from eq. (5) that the monomials
in the sum are linearly independent so as a vector space $\mathcal{C}_{2n}$ has dimension $2^{2n} = 2^n \times 2^n$.
Hence (hermitian) matrix representations of the $c_\mu$'s will involve matrices of size $2^n \times 2^n$.

Operators satisfying eq. (5) arise in the formalism of fermionic physics where they are 
known as Majorana spinors. In that formalism we start with a set of operators $a_1, \ldots, a_n$
associated to n free fermionic modes, satisfying the standard anti-commutation relations
for fermionic creation and annihilation operators:

$$
\{a_i, a_j\} \equiv a_i a_j + a_j a_i = 0 = \{a_i^\dagger, a_j^\dagger\}, \qquad \{a_i, a_j^\dagger\} = \delta_{ij} I.
$$

Then as a consequence of these relations, the following hermitian operators (which are
fermionic analogues of position and momentum operators):

$$
c_{2k-1} = a_k + a_k^\dagger, \qquad c_{2k} = \frac{a_k - a_k^\dagger}{i}, \qquad k = 1, \ldots, n,
$$

satisfy the Clifford algebra relations eq. (5). However we emphasise that in the present 
paper we are not concerned with the study of free fermions per se but rather, consideration 
of general quantum circuit simulation properties, based on the Clifford algebra structure, 
which can also go beyond the fermionic formalism (such as the statement in theorem 2 and 
results in section 6 below). 

For theorem 1 and generalisations we will (in later sections) consider matrix representations
of the Clifford algebra but we first develop some further abstract algebra. A quadratic
Hamiltonian is an element of $\mathcal{C}_{2n}$ of the form,

$$
H = i \sum_{\mu \neq \nu = 1}^{2n} h_{\mu\nu}\, c_\mu c_\nu \tag{7}
$$

where $h_{\mu\nu}$ is a 2n × 2n matrix of coefficients. Note that we omit μ = ν terms which contribute
only an overall additive constant to H. Since $c_\mu c_\nu = -c_\nu c_\mu$ and imposing $H = H^\dagger$ we may
w.l.o.g. take $h_{\mu\nu}$ to be a real antisymmetric matrix.

² As mentioned in Knill 2001 and Somma et al. 2006, it is possible to consider 2n + 1 anti-commuting
operators to define the Clifford algebra and correspondingly to have SO(2n + 1) symmetry in theorem 3
below. However, this extension appears not to lead to a significant generalization of our results.

Given H we consider the unitary operation $U = e^{iH}$ (where the exponential is calculated
in the algebra of $\mathcal{C}_{2n}$ as the power series). Any such unitary operation corresponding to a
quadratic Hamiltonian is called a Gaussian operation. The following result from fermionic 
linear optics (cf Knill 2001, Terhal and DiVincenzo 2002, DiVincenzo and Terhal 2005, 
Bravyi 2005) will be basic for our classical simulation results and we include a simple proof 
of it. 

Theorem 3 Let H be any quadratic Hamiltonian and $U = e^{iH}$ the corresponding Gaussian
operation. Then for all μ:

$$
U c_\mu U^\dagger = \sum_{\nu=1}^{2n} R_{\mu\nu}\, c_\nu
$$

where the matrix R is in SO(2n), and we obtain all of SO(2n) in this way. In fact $R = e^{4h}$.

Proof: Write $c_\mu$ as $c_\mu(0)$ and introduce $c_\mu(t) = U(t)\, c_\mu(0)\, U(t)^\dagger$ with $U(t) = e^{iHt}$. Then

$$
\frac{dc_\mu(t)}{dt} = i\,[H, c_\mu(t)]
$$

(with square brackets [a, b] denoting the commutator ab − ba). But $[c_{\nu_1} c_{\nu_2}, c_\mu] = 0$ if
$\mu \neq \nu_1, \nu_2$ and $[c_\mu c_\nu, c_\mu] = -2c_\nu$ (using eq. (5)) so

$$
\frac{dc_\mu(t)}{dt} = \sum_{\nu} 4 h_{\mu\nu}\, c_\nu(t) \qquad \text{and hence} \qquad c_\mu(t) = \sum_{\nu} R_{\mu\nu}(t)\, c_\nu(0)
$$

where $R = e^{4ht}$. It is well known that antisymmetric matrices are the infinitesimal
generators of rotations and the theorem follows by just setting t = 1. □
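
The statement of theorem 3 can be checked numerically on a small case. The sketch below is an illustration under stated assumptions: it takes n = 2, uses the Pauli-product matrices XI, YI, ZX, ZY as one convenient set of generators satisfying eq. (5), picks an arbitrary real antisymmetric h, and evaluates both exponentials by a truncated power series. All helper names are hypothetical.

```python
def matmul(P, Q):
    return [[sum(P[i][t] * Q[t][j] for t in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def kron(P, Q):
    return [[a * b for a in rp for b in rq] for rp in P for rq in Q]

def add(P, Q):
    return [[a + b for a, b in zip(rp, rq)] for rp, rq in zip(P, Q)]

def scale(P, s):
    return [[s * a for a in row] for row in P]

def identity(d):
    return [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]

def dagger(P):
    return [[complex(P[j][i]).conjugate() for j in range(len(P))] for i in range(len(P[0]))]

def expm(M, terms=40):
    """Matrix exponential by its power series (fine for small-norm matrices)."""
    result, term = identity(len(M)), identity(len(M))
    for k in range(1, terms):
        term = scale(matmul(term, M), 1.0 / k)
        result = add(result, term)
    return result

def close(P, Q, tol=1e-9):
    return all(abs(a - b) < tol for rp, rq in zip(P, Q) for a, b in zip(rp, rq))

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
c = [kron(X, I2), kron(Y, I2), kron(Z, X), kron(Z, Y)]  # c_1, ..., c_4

# Arbitrary real antisymmetric coefficient matrix (hypothetical values).
h = [[0.0, 0.10, -0.20, 0.05],
     [-0.10, 0.0, 0.15, 0.12],
     [0.20, -0.15, 0.0, -0.07],
     [-0.05, -0.12, 0.07, 0.0]]

# H = i * sum_{mu != nu} h[mu][nu] c_mu c_nu, as in eq. (7)
H = [[0] * 4 for _ in range(4)]
for mu in range(4):
    for nu in range(4):
        if mu != nu:
            H = add(H, scale(matmul(c[mu], c[nu]), 1j * h[mu][nu]))

U = expm(scale(H, 1j))   # U = exp(iH)
R = expm(scale(h, 4.0))  # R = exp(4h)

def conjugated(mu):
    """U c_mu U^dagger."""
    return matmul(matmul(U, c[mu]), dagger(U))

def rotated(mu):
    """sum_nu R[mu][nu] c_nu."""
    out = [[0] * 4 for _ in range(4)]
    for nu in range(4):
        out = add(out, scale(c[nu], R[mu][nu]))
    return out

theorem3_ok = all(close(conjugated(mu), rotated(mu)) for mu in range(4))
R_transposed = [[R[j][i] for j in range(4)] for i in range(4)]
orthogonal_ok = close(matmul(R, R_transposed), identity(4))
```

The conjugation acts on the 4-dimensional span of the generators only, even though U itself is a 4x4 operator involving products of all of them.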

The significance of theorem 3 for us is the following: note that $e^{iH}$ generally involves all
products of all generators so the expression $U c_\mu U^\dagger$ could potentially finish up anywhere in
the exponentially large ($2^{2n}$-dimensional) linear space $\mathcal{C}_{2n}$. However it always happens to
stay within the polynomially small (2n-dimensional) subspace spanned by just the generators
themselves. We exploit this feature of the adjoint representation (cf. also Somma et al. 2006 
for a more general Lie-theoretic setting), using the following strategy. 

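We find a hermitian representation of the $c_\mu$'s on n-qubit operators and then the Gaussian
operations corresponding to quadratic Hamiltonians define a class of n-qubit unitary
gates. Let U be any circuit of these with $|\psi_{\rm out}\rangle = U|\psi_{\rm in}\rangle$ for some choice of input state.
Then by theorem 3 for each μ we have the expectation value

$$
\langle c_\mu \rangle_{\rm out} = \langle\psi_{\rm in}|\, U^\dagger c_\mu U \,|\psi_{\rm in}\rangle = \sum_{\nu=1}^{2n} R_{\mu\nu}\, \langle\psi_{\rm in}|\, c_\nu \,|\psi_{\rm in}\rangle \tag{8}
$$

where $R_{\mu\nu}$ is the product of the SO(2n) matrices that theorem 3 assigns to the inverses of the
individual gates of U (each being the transpose of the matrix of the gate itself). Hence the
full matrix $R_{\mu\nu}$ is poly time computable.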
Now suppose further that $|\psi_{\rm in}\rangle = |\xi_1\rangle \ldots |\xi_n\rangle$ is a product state and that $c_\mu$ is represented
by a product operator $P_1 \ldots P_n$. Then $\langle\psi_{\rm in}| c_\mu |\psi_{\rm in}\rangle = \prod_{i=1}^{n} \langle\xi_i| P_i |\xi_i\rangle$ is also poly time
computable and hence $\langle c_\mu\rangle_{\rm out}$ is poly time computable for each μ.

However we really want $\langle Z_k\rangle_{\rm out} = p_0 - p_1$ where $Z_k$ is the Pauli Z operator acting on
the k-th line. Recall that $\mathcal{C}_{2n}$ as a vector space has dimension $2^n \times 2^n$ so it spans all n-qubit
matrices in our representation. Thus $Z_k$ must be expressible as some polynomial of the
form eq. (6). If this polynomial has a constant degree, independent of n, then $\langle Z_k\rangle_{\rm out}$ will
be poly time computable too. As an example suppose $Z_1 = -i c_1 c_2$. Then

$$
\begin{aligned}
\langle Z_1\rangle_{\rm out} &= \langle\psi_{\rm in}|\, (-i)\, U^\dagger c_1 c_2 U \,|\psi_{\rm in}\rangle = \langle\psi_{\rm in}|\, (-i)(U^\dagger c_1 U)(U^\dagger c_2 U) \,|\psi_{\rm in}\rangle \\
&= \sum_{\nu_1 \neq \nu_2 = 1}^{2n} R_{1\nu_1} R_{2\nu_2}\, \langle\psi_{\rm in}|\, (-i)\, c_{\nu_1} c_{\nu_2} \,|\psi_{\rm in}\rangle.
\end{aligned} \tag{9}
$$

If the $c_\mu$ are product operators then so are all the monomials such as $c_{\nu_1} c_{\nu_2}$ and
$\langle\psi_{\rm in}| c_{\nu_1} c_{\nu_2} |\psi_{\rm in}\rangle$ will be poly time computable for any product state input $|\psi_{\rm in}\rangle$. Note
also that the size of the sum in eq. (9) is $O(n^2)$ (compared to the $O(n)$ sized sum for
$\langle c_\mu\rangle_{\rm out}$ in eq. (8)) and hence $\langle Z_1\rangle_{\rm out}$ is poly time computable too. This argument is easily
generalized to give the following result.

Theorem 4 Consider any poly-sized circuit of Gaussian gates acting on a product input
state. If the observable $Z_k$ in the final measurement is expressible in $\mathcal{C}_{2n}$ as a polynomial
of degree d, then for each of its monomials the corresponding sum as in eq. (9) for $\langle Z_k\rangle_{\rm out}$
will be $O(n^d)$ sized and hence $\langle Z_k\rangle_{\rm out}$ will be poly time computable if d does not increase
with n.



5 The Jordan-Wigner representation and theorem 1

Introduce the 2n hermitian operators on n-qubits (omitting tensor product symbols ⊗
throughout):

$$
\begin{array}{llll}
c_1 = X\,I \cdots I, & c_3 = Z\,X\,I \cdots I, & \cdots & c_{2k-1} = Z \cdots Z\,X\,I \cdots I, \quad \cdots \\
c_2 = Y\,I \cdots I, & c_4 = Z\,Y\,I \cdots I, & \cdots & c_{2k} = Z \cdots Z\,Y\,I \cdots I, \quad \cdots
\end{array} \tag{10}
$$

where X and Y are in the k-th slot for $c_{2k-1}$ and $c_{2k}$, and k ranges from 1 to n. Thus
the operators $c_{2k-1}, c_{2k}$ are associated to the k-th qubit line. It is straightforward to check
that these matrices satisfy the relations eq. (5) so we have a representation of the Clifford
algebra $\mathcal{C}_{2n}$, known as the Jordan-Wigner representation (Jordan and Wigner 1928). This
is in fact the unique representation of $\mathcal{C}_{2n}$ up to a global unitary equivalence. Furthermore,
$Z_k = -i c_{2k-1} c_{2k}$, which has bounded degree two, and the $c_\mu$'s are all product operators.
Hence for any poly sized circuit of Gaussian gates with a product state input, $\langle Z_k\rangle_{\rm out}$ is
poly time computable. But what do these Gaussian gates actually look like?

Consider first just qubit lines 1 and 2 (i.e. $c_1, c_2, c_3$ and $c_4$) and corresponding quadratic
Hamiltonians which involve 6 possible terms:

$$
\begin{array}{ll}
-i c_1 c_2 = Z\,I, & -i c_2 c_3 = X\,X, \\
\phantom{-}i c_1 c_3 = Y\,X, & -i c_2 c_4 = X\,Y, \\
\phantom{-}i c_1 c_4 = Y\,Y, & -i c_3 c_4 = I\,Z.
\end{array}
$$

These operators are all trace free and all preserve the even and odd parity subspaces.
Hence the corresponding 6 parameter family of Gaussian gates must be SU(2) ⊕ SU(2)
decomposed relative to the two parity subspaces. More explicitly we may first construct
the Pauli X, Y, Z operators acting within the two subspaces (e.g. $\frac{1}{2}(XX + YY)$ is X acting
in the odd subspace relative to the $\{|01\rangle, |10\rangle\}$ basis, and maps the even subspace to zero)
and generate the two SU(2)'s by direct exponentiation. Hence we get precisely the G(A, B)
gates for lines 1 and 2 as the Gaussian operations $U = e^{iH}$ with the quadratic Hamiltonian
of eq. (7) restricted to use of $c_1, c_2, c_3, c_4$ only. Similarly for any pair of consecutive lines we
get all n.n. G(A, B) gates (since for lines k, k + 1 the initial Z operators in eq. (10) are
eliminated in all quadratic products $c_\mu c_\nu$ and the calculation proceeds exactly as above).

Thus all n.n. G(A, B) gates are Gaussian for the Jordan-Wigner representation and this
completes the proof of theorem 1. 
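
The resulting simulation procedure can be sketched in a few lines. The code below is a minimal illustration, not an optimised implementation: it assumes that the 2n × 2n rotation matrix R of eq. (8) for the whole circuit has already been computed, that the input is a product state given as a list of normalised single-qubit vectors, and that the generators are the Pauli strings of eq. (10). The function names are hypothetical.

```python
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def jw_factors(nu, n):
    """Single-qubit factors of c_nu (nu = 1..2n) in eq. (10)."""
    k = (nu + 1) // 2                 # the line carrying X (nu odd) or Y (nu even)
    P = X if nu % 2 == 1 else Y
    return [Z] * (k - 1) + [P] + [I2] * (n - k)

def mul2(P, Q):
    return [[sum(P[i][t] * Q[t][j] for t in range(2)) for j in range(2)] for i in range(2)]

def expect1(xi, P):
    """<xi| P |xi> for a single-qubit state xi = [amp0, amp1]."""
    return sum(xi[a].conjugate() * P[a][b] * xi[b] for a in range(2) for b in range(2))

def expect_pair(a, b, state):
    """<psi| c_a c_b |psi> for a product state; factorises over the qubit lines."""
    n = len(state)
    val = 1
    for xi, P, Q in zip(state, jw_factors(a, n), jw_factors(b, n)):
        val *= expect1(xi, mul2(P, Q))
    return val

def z_expectation(R, k, state):
    """<Z_k>_out via the double sum of eq. (9), using Z_k = -i c_{2k-1} c_{2k}."""
    n = len(state)
    total = 0
    for a in range(1, 2 * n + 1):
        for b in range(1, 2 * n + 1):
            if a != b:
                total += R[2 * k - 2][a - 1] * R[2 * k - 1][b - 1] * (-1j) * expect_pair(a, b, state)
    return total.real

# Example inputs (hypothetical): the empty circuit and the swap-like rotation
# that exchanges (c_1, c_2) with (c_3, c_4), both acting on |0>|1>.
zero, one = [1, 0], [0, 1]
R_identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
R_swap = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
```

The double sum has O(n²) terms and each term is a product of n single-qubit expectation values, so the cost of this sketch is O(n³) once R is known.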

But there are still more Gaussian gates, generated by quadratic Hamiltonians involving 
more $c_\mu$'s associated to a larger number and more distant lines. Note first that if we use
only the four $c_\mu$'s associated to a pair of not-n.n. lines, we do not get the corresponding
(now non-n.n.) G(A, B) gate acting on those lines. For example consider the quadratic
term $c_2 c_4$ (associated to n.n. lines 1 and 2) replaced by $c_2 c_6$ (being the corresponding
operator associated to lines 1 and 3). We have $-i c_2 c_4 = X_1 Y_2$ but $-i c_2 c_6 = X_1 Z_2 Y_3$. Hence
exponentiation of the latter does not correspond to exponentiation of XY for lines 1 and 3
but gives a gate acting nontrivially across all three lines.
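
The operator identities of this section are easy to confirm by direct matrix multiplication. The following sketch, with hypothetical helper names, builds the Jordan-Wigner operators for n = 3 from Pauli-string labels and checks the anti-commutation relations of eq. (5), the six n.n. quadratic terms on lines 1 and 2 (padded with an identity on line 3), and the non-n.n. example involving $c_2 c_6$.

```python
PAULI = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}

def matmul(P, Q):
    return [[sum(P[i][t] * Q[t][j] for t in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def kron(P, Q):
    return [[a * b for a in rp for b in rq] for rp in P for rq in Q]

def add(P, Q):
    return [[a + b for a, b in zip(rp, rq)] for rp, rq in zip(P, Q)]

def scale(P, s):
    return [[s * a for a in row] for row in P]

def identity(d):
    return [[1 if i == j else 0 for j in range(d)] for i in range(d)]

def close(P, Q, tol=1e-12):
    return all(abs(a - b) < tol for rp, rq in zip(P, Q) for a, b in zip(rp, rq))

def op(label):
    """Tensor product of Paulis named by a string such as 'XZY' (line 1 first)."""
    out = [[1]]
    for ch in label:
        out = kron(out, PAULI[ch])
    return out

def c(nu, n):
    """Jordan-Wigner generator c_nu of eq. (10), nu = 1..2n."""
    k = (nu + 1) // 2
    return op("Z" * (k - 1) + ("X" if nu % 2 else "Y") + "I" * (n - k))

n = 3
dim = 2 ** n
gens = {nu: c(nu, n) for nu in range(1, 2 * n + 1)}

def anticommutator(P, Q):
    return add(matmul(P, Q), matmul(Q, P))

relations_ok = all(
    close(anticommutator(gens[mu], gens[nu]), scale(identity(dim), 2 if mu == nu else 0))
    for mu in gens for nu in gens
)

def quad(coeff, mu, nu):
    return scale(matmul(gens[mu], gens[nu]), coeff)

six_terms_ok = all(
    close(quad(coeff, mu, nu), op(label))
    for coeff, mu, nu, label in [
        (-1j, 1, 2, "ZII"), (-1j, 2, 3, "XXI"),
        (1j, 1, 3, "YXI"), (-1j, 2, 4, "XYI"),
        (1j, 1, 4, "YYI"), (-1j, 3, 4, "IZI"),
    ]
)

# The non-n.n. product picks up a Z on the intermediate line 2.
nonlocal_ok = close(quad(-1j, 2, 6), op("XZY"))
z_ok = all(close(quad(-1j, 2 * k - 1, 2 * k), op("I" * (k - 1) + "Z" + "I" * (n - k)))
           for k in range(1, n + 1))
```

The string of Z operators cancels in products of generators on adjacent lines but survives on any line lying strictly between the two lines involved.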

In summary so far, we see that non-n.n. G(A, B)'s are not generally Gaussian. But n.n.
G(A, B)'s are all examples of Gaussian operations, albeit only special cases in the full set
of such operations that may generally act on any number of qubit lines. Finally we show 
that these apparently more general Gaussian operations actually bring nothing new, in the 
context of circuits of gates: 

Theorem 5 Let $H = i \sum_{\mu \neq \nu} h_{\mu\nu} c_\mu c_\nu$ be any quadratic Hamiltonian with corresponding
Gaussian gate $V = e^{iH}$ on n qubits. Then V as an operator on n qubits is expressible
as a circuit of $O(n^3)$ n.n. G(A, B) gates i.e. $V = U_N U_{N-1} \cdots U_1$ where each $U_j = e^{iH_j}$
having $H_j = i \sum_{\mu \neq \nu} h^{(j)}_{\mu\nu} c_\mu c_\nu$ with the sum involving only four $c_\mu$'s associated to two n.n. lines
viz. $c_{2k-1}, c_{2k}, c_{2k+1}, c_{2k+2}$ for some fixed k.

Note that (as shown in the proof below) the circuit expression of theorem 5 is exact, 
analytic, and explicitly describable in poly-time, in contrast to an alternative standard, but 
generally inefficient, asymptotic decomposition utilising the Lie-Trotter expansion (for an
exponential of a sum of generally non-commuting operators). 

Proof: Let $V = e^{iH}$ be any Gaussian operation as above. We have

$$V^\dagger c_\mu V = \sum_{\nu=1}^{2n} R_{\mu\nu} c_\nu$$

with $R \in SO(2n)$. We can efficiently decompose R into its generalized Euler angles (by the
algorithm of section 4 in Hoffman et al. 1972), obtaining $R = r_1 r_2 \cdots r_M$ where $M = O(n^2)$
and each $r_j$ is a rotation in 2n dimensions that acts non-trivially only in 2 dimensions,
spanned by say the a-th and b-th co-ordinates. Thus $r_j = e^{4h_j}$ where $h_j$ is an antisymmetric
matrix with nonzero values (denoted $\pm\theta/2$) only in its a-th and b-th columns and rows. Then
introduce $H_j = i\theta c_a c_b$ so $U_j = e^{iH_j}$ has

$$U_j^\dagger c_\mu U_j = \sum_{\nu=1}^{2n} [r_j]_{\mu\nu} c_\nu .$$

In this construction $c_a$ and $c_b$ do not generally belong to n.n. qubit lines. To remedy this
we introduce the n.n. "modified swap" operation (Bravyi and Kitaev 2002) defined, for
example for lines 1 and 2, by

$$S_{12} = \exp\left(\frac{\pi}{4}\left(-c_1 c_4 + c_2 c_3 + c_1 c_2 + c_3 c_4\right)\right). \qquad (11)$$

We can readily verify that

$$S_{12}^\dagger c_1 S_{12} = c_3, \qquad S_{12}^\dagger c_2 S_{12} = c_4$$

i.e. $S_{12}$ swaps the roles of the pairs $(c_1, c_2)$ and $(c_3, c_4)$. Similarly we have $S_{k,k+1}$ for any n.n.
line pair to swap pairs $(c_{2k-1}, c_{2k})$ and $(c_{2k+1}, c_{2k+2})$. Note that the exponent in eq. (11) is
n.n. quadratic so $S_{12}$ is a n.n. Gaussian gate. In fact in the Jordan-Wigner representation
we get, up to an overall phase, $S_{12}$ = (CZ)(SWAP) = G(Z, X).
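The stated properties of the modified swap can be confirmed on two qubits. The sketch below assumes the usual Jordan-Wigner form of eq. (10) and takes the prefactor in eq. (11) to be $\pi/4$ (an assumption of this sketch; with this value the swap relations hold exactly). The helper `expm_antihermitian` is illustrative and uses only NumPy.

```python
from functools import reduce
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def jordan_wigner(n):
    """Return [c_1, ..., c_2n]; list index mu-1 holds c_mu (assumed form of eq. (10))."""
    cs = []
    for k in range(n):
        prefix, suffix = [Z] * k, [I2] * (n - k - 1)
        cs.append(reduce(np.kron, prefix + [X] + suffix))
        cs.append(reduce(np.kron, prefix + [Y] + suffix))
    return cs


def expm_antihermitian(A):
    """exp(A) for anti-Hermitian A, via the eigendecomposition of the Hermitian matrix -iA."""
    w, v = np.linalg.eigh(-1j * A)
    return (v * np.exp(1j * w)) @ v.conj().T


c1, c2, c3, c4 = jordan_wigner(2)
# Assumed prefactor pi/4 in the exponent of eq. (11).
S12 = expm_antihermitian((np.pi / 4) * (-c1 @ c4 + c2 @ c3 + c1 @ c2 + c3 @ c4))

CZ = np.diag([1, 1, 1, -1]).astype(complex)
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)


def conjugate(S, c):
    """Return S^dagger c S."""
    return S.conj().T @ c @ S


swaps_c1 = np.allclose(conjugate(S12, c1), c3)
swaps_c2 = np.allclose(conjugate(S12, c2), c4)
swaps_back = np.allclose(conjugate(S12, c3), c1) and np.allclose(conjugate(S12, c4), c2)
equals_i_cz_swap = np.allclose(S12, 1j * CZ @ SWAP)
print(swaps_c1, swaps_c2, swaps_back, equals_i_cz_swap)  # True True True True
```

With this prefactor the exponential equals (CZ)(SWAP) times the phase i, which is the sense in which $S_{12}$ coincides with G(Z, X).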

Returning to $U_j$ and $H_j$, if $c_a$ and $c_b$ are not associated to n.n. qubit lines we can use
a ladder of $S_{k,k+1}$ conjugations to express $U_j$ as a product of at most O(n) n.n. G(A, B)'s.
Thus starting from V we obtain a product $U = U_N \cdots U_1$ of at most $O(n^3)$ n.n. G(A, B)
gates such that $V^\dagger c_\mu V = U^\dagger c_\mu U$ for all $c_\mu$. Hence this relation holds for all monomials
$c_{\mu_1} \cdots c_{\mu_k}$ too and thus for arbitrary matrices M (as the monomials span all matrices) i.e.
$V^\dagger M V = U^\dagger M U$ for all M so $U = e^{i\delta} V$ for some overall phase $\delta$, which may be set to zero
by a further trivial G(A, B) gate. □
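The first step of the proof, writing $R \in SO(2n)$ as a product of $O(n^2)$ rotations that each act in a single co-ordinate plane, can be illustrated as follows. This is a minimal sketch using standard Givens elimination as a stand-in; it is not claimed to be the algorithm of Hoffman et al. 1972, and the function names are illustrative.

```python
import numpy as np


def planar_rotation(m, a, b, theta):
    """m x m rotation by angle theta acting only in the (a, b) co-ordinate plane (0-based)."""
    G = np.eye(m)
    G[a, a] = G[b, b] = np.cos(theta)
    G[a, b] = -np.sin(theta)
    G[b, a] = np.sin(theta)
    return G


def givens_decompose(R):
    """Return (factors, residual): R equals the product, in list order, of
    planar_rotation(m, a, b, theta) over factors; residual should be the identity."""
    m = R.shape[0]
    work = np.array(R, dtype=float)
    factors = []
    for col in range(m - 1):
        for row in range(m - 1, col, -1):
            a, b = row - 1, row
            theta = np.arctan2(work[b, col], work[a, col])
            # Rotating by -theta sends (x, y) in rows (a, b) of this column to (r, 0).
            work = planar_rotation(m, a, b, -theta) @ work
            factors.append((a, b, theta))
    return factors, work


def random_special_orthogonal(m, seed=0):
    """A pseudo-random element of SO(m), built from a QR factorisation."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.normal(size=(m, m)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1
    return Q


n = 3                      # number of qubit lines, so R is 2n x 2n
m = 2 * n
R = random_special_orthogonal(m)
factors, residual = givens_decompose(R)

rebuilt = np.eye(m)
for a, b, theta in factors:
    rebuilt = rebuilt @ planar_rotation(m, a, b, theta)

print(len(factors), np.allclose(rebuilt, R), np.allclose(residual, np.eye(m)))
```

For a 2n x 2n matrix this elimination yields n(2n-1) planar factors, consistent with the $M = O(n^2)$ count used in the proof.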

We remark that theorem 5 has a direct application in digital quantum simulation
(algorithmic quantum simulation by the set of elementary gates) of a 1D quantum system
whose Hamiltonian H is describable in the form of eq. (7). In particular, this includes the
1D XY Hamiltonian which exhibits a quantum phase transition for suitable choice of its
parameters. We see that the real-time dynamics of the XY Hamiltonian for any length of
time t can be efficiently quantumly simulated in terms of n.n. G(A, B) gates. Another
efficient circuit simulation of the XY Hamiltonian was described recently in Verstraete et
al. 2008.

6 Gaussian quantum circuits intertwined by Clifford operations

Recall (cf. Nielsen and Chuang 2000) that the Pauli group $\mathcal{P}_n$ on n qubits contains all n-fold
tensor products $P_1 \otimes \ldots \otimes P_n$ of Pauli matrices (i.e. each $P_j$ is I, X, Y or Z) together with
overall factors of ±1 and ±i. An n-qubit operation T is a Clifford operation iff $T^\dagger A T \in \mathcal{P}_n$
for all $A \in \mathcal{P}_n$ i.e. conjugation by T preserves the product Pauli structure. It is known
(Gottesman 1997) that a unitary operation T is a Clifford operation iff it can be expressed
as a circuit of controlled-NOT (CNOT), Hadamard H and P = diag(1, i) gates.

With this in mind recall also that the Jordan-Wigner representation of $\mathcal{C}_{2n}$ comprises not
only product operators, but Pauli products. It is easy to verify that if a set of hermitian
operators $c_\mu$ satisfy the Clifford algebra relations eq. (5) then so do $c'_\mu = V^\dagger c_\mu V$ for
any unitary V. Now recall that our classical simulation result relied upon the quadratic
Hamiltonian property in theorem 3 - which in turn rests on the algebra relations eq. (5) -
and the product structure of the matrix representation (associated to product state inputs).
Hence if we choose V in $c'_\mu = V^\dagger c_\mu V$ to be a Clifford operation T we preserve both features
and we can obtain new classes of classically efficiently simulatable quantum circuits using
the Gaussian operations provided by the $c'_\mu$'s (assuming that the conditions of theorem 4
are also satisfied). Note that the Clifford operation T itself cannot generally be obtained
as a Gaussian gate of the original Clifford algebra representation $c_\mu$, nor thus by a circuit
of n.n. G(A, B) gates.
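The invariance of the Clifford algebra relations under unitary conjugation is easy to test numerically. The sketch below assumes eq. (5) is the standard relation $\{c_\mu, c_\nu\} = 2\delta_{\mu\nu} I$ and the usual Jordan-Wigner form for eq. (10); the random unitary V and all helper names are illustrative.

```python
from functools import reduce
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def jordan_wigner(n):
    """Return [c_1, ..., c_2n]; list index mu-1 holds c_mu (assumed form of eq. (10))."""
    cs = []
    for k in range(n):
        prefix, suffix = [Z] * k, [I2] * (n - k - 1)
        cs.append(reduce(np.kron, prefix + [X] + suffix))
        cs.append(reduce(np.kron, prefix + [Y] + suffix))
    return cs


def satisfies_clifford_relations(cs):
    """Check hermiticity and {c_mu, c_nu} = 2 delta_{mu nu} I for all pairs."""
    dim = cs[0].shape[0]
    for mu, a in enumerate(cs):
        if not np.allclose(a, a.conj().T):
            return False
        for nu, b in enumerate(cs):
            expected = 2 * np.eye(dim) if mu == nu else np.zeros((dim, dim))
            if not np.allclose(a @ b + b @ a, expected):
                return False
    return True


n = 3
rng = np.random.default_rng(1)
M = rng.normal(size=(2**n, 2**n)) + 1j * rng.normal(size=(2**n, 2**n))
V, _ = np.linalg.qr(M)          # Q factor of a complex matrix is unitary

cs = jordan_wigner(n)
cs_new = [V.conj().T @ c @ V for c in cs]

original_ok = satisfies_clifford_relations(cs)
conjugated_ok = satisfies_clifford_relations(cs_new)
print(original_ok, conjugated_ok)  # True True
```

For a generic V the conjugated generators are no longer Pauli products, which is why the construction restricts V to Clifford operations.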

The conjugation action of T can be taken outside the quadratic Hamiltonian and the
exponential power series sum, showing that the new Gaussian gate $U_{\rm new} = T^\dagger U_{\rm old} T$ is just
the original one $U_{\rm old}$ (e.g. n.n. G(A, B)'s) conjugated by T. In a new circuit comprising new
Gaussian gates $U_{\rm new}$, the intermediate Clifford operations T can be viewed as cancelling
each other by the unitarity identity $T T^\dagger = I$. Thus we can alternatively think of these new
simulatable circuits as being the same as the old ones but the input states are now $T|\psi_{\rm in}\rangle$
(now generally entangled) and the final measurement is now $T Z_k T^\dagger$ (now generally a multi-line
observable) rather than $Z_k$ itself, i.e. we extend the class of allowed inputs and measured
outputs in theorem 1 while maintaining classical efficient simulatability. From this point of
view the new freedom associated to use of Clifford operations T appears at the boundary
of circuits, which is analogous to Valiant's use of basis changes in his notion of holographic
algorithm (Valiant 2007).

In our construction T is generally a global (n-qubit) operation, and we will require two
further features:

(a) The Pauli operator $Z_k$ should be expressible as a bounded degree polynomial in the $c'_\mu$'s.
Recall that the classical simulation cost depends on this degree d as $O(n^d)$ (cf. theorem 4),
and it was previously quadratic, but with arbitrary T's we may get d = O(n).

(b) We wish to identify suitably local new gates $U_{\rm new}$ acting on say, some constant number
K of qubit lines. For general T operators, even the conjugates of n.n. G(A, B) gates may
become global n-qubit operators, so we may for example, seek Clifford T's such that these
particular conjugates remain K-local for some K. In contrast to (a) this requirement is
not essential for the existence of an efficient classical simulation but it is desirable in view
of the usual notion of quantum circuit as comprising local gates each acting on a bounded
number of lines.

We also remark that, in the above construction, we need to choose a Clifford operation
$T_n$ for each number n of qubit lines. A curious feature is that in addition to being able to
vary the structure of $T_n$ with n, each $T_n$ need not itself be "translationally uniform" across
the n lines whereas the class of all n.n. G(A, B) gates as a whole does have a translationally
uniform structure. Hence we can obtain classically simulatable quantum circuits which have
different kinds of gates allowed on different sections of the qubit line set.

Figure 2: Intertwining Clifford operations T for the examples 2 and 3 in (a) and (b) respectively,
in order to change the representation of the Clifford algebra.

To conclude this section we give three illustrative examples of this construction. 

Example 1. Clearly any circuit T of SWAP operations is a Clifford operation. In
this case $T^\dagger G(A, B)_{\rm n.n.} T$ amounts to allowing the G(A, B) gates to act on correspondingly
selected distant lines. However in any such resulting Gaussian circuit the lines may always
be simply re-ordered to restore all G(A, B)'s to n.n. status.

Example 2. Let $\mathrm{CNOT}_{ij}$ denote the n-qubit operation that applies the 2-qubit CNOT
gate with control line i and target line j. Let $H_i$ denote the 1-qubit Hadamard gate on line
i. Consider the (translationally uniform) Clifford operation:

$$T_n = \mathrm{CNOT}_{12}\,\mathrm{CNOT}_{23} \cdots \mathrm{CNOT}_{n-1,n}\, H_1 H_2 \cdots H_n$$

as depicted in figure 2 (a). Indeed, this T operation is known as a duality transformation
of a 1D quantum system (cf. Plenio 2007). Conjugating the Jordan-Wigner representation
$c_\mu$'s of eq. (10), we obtain $c'_\mu = T^\dagger c_\mu T$ giving:

$$c'_{2k-1} = X_{k-1}\Big(\prod_{j=k}^{n} Z_j\Big), \qquad c'_{2k} = -Y_k\Big(\prod_{j=k+1}^{n} Z_j\Big)$$

where $X_0$ denotes the identity,

so that $Z_k = -i c'_{2k} c'_{2k+1}$ remains quadratic in the generators. The six n.n. quadratic
Hamiltonian terms on lines k, k+1 are

$$i\left(\alpha_0 c_{2k-1}c_{2k+2} - \alpha_1 c_{2k}c_{2k+1} + \beta_1 c_{2k-1}c_{2k+1} - \beta_2 c_{2k}c_{2k+2} - \gamma_1 c_{2k-1}c_{2k} - \gamma_2 c_{2k+1}c_{2k+2}\right)$$

and correspond in the Jordan-Wigner representation to

$$\alpha_0 Y_k Y_{k+1} + \alpha_1 X_k X_{k+1} + \beta_1 Y_k X_{k+1} + \beta_2 X_k Y_{k+1} + \gamma_1 Z_k + \gamma_2 Z_{k+1}.$$

Conjugation by T gives the Hamiltonian (also known as the three-body cluster-state
interaction):

$$-\alpha_0 X_{k-1} Z_k X_{k+1} - \beta_1 X_{k-1} Y_k - \beta_2 Y_k X_{k+1} + \gamma_1 X_{k-1} X_k + \gamma_2 X_k X_{k+1} + \alpha_1 Z_k.$$

Note that this Hamiltonian (although quadratic in the $c'_\mu$'s) has now become 3-local so that
the 6 parameter family of n.n. G(A, B)'s on lines k, k+1 will conjugate to a 6 parameter
family of 3-local gates on lines k-1, k, k+1 (and we omit computation of the explicit form
of these 3-qubit gates). Since we have expanded into 3 lines we may go back and consider
arbitrary quadratic Hamiltonian terms in the $c_\mu$'s associated to lines k-1, k, k+1, involving
$\binom{6}{2} = 15$ parameters. By computing their conjugates under T we find that they all remain
3-local, giving a 15 parameter family of 3-local Gaussian gates. However by theorem 5, any
member of this 15 parameter family is obtainable as a circuit of the initial 6 parameter
family. Finally, we see, by the construction, that arbitrary poly-sized circuits of the 15
parameter family of 3-local gates, with input product states and a final $Z_k$ measurement
can be classically efficiently simulated.
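The formulas of Example 2 can be checked by brute force on a few qubits. The sketch below takes n = 4 and k = 2 and assumes the usual Jordan-Wigner form for eq. (10), the operator ordering $T_n = \mathrm{CNOT}_{12} \cdots \mathrm{CNOT}_{n-1,n} H_1 \cdots H_n$ read as a matrix product, conjugated generators $X_{k-1}\prod_{j\ge k} Z_j$ and $-Y_k \prod_{j>k} Z_j$ (with $X_0$ the identity), and the conjugated Pauli terms $-X_{k-1}Z_kX_{k+1}$, $Z_k$, $-X_{k-1}Y_k$, $-Y_kX_{k+1}$, $X_{k-1}X_k$, $X_kX_{k+1}$. All helper names are illustrative.

```python
from functools import reduce
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
P0, P1 = np.diag([1, 0]).astype(complex), np.diag([0, 1]).astype(complex)


def on_lines(n, ops):
    """n-qubit tensor product: ops[line] on the given 1-based lines, identity elsewhere."""
    return reduce(np.kron, [ops.get(line, I2) for line in range(1, n + 1)])


def jordan_wigner(n):
    """Return [c_1, ..., c_2n] (assumed form of eq. (10))."""
    cs = []
    for k in range(1, n + 1):
        string = {j: Z for j in range(1, k)}
        cs.append(on_lines(n, {**string, k: X}))
        cs.append(on_lines(n, {**string, k: Y}))
    return cs


def cnot(n, control, target):
    return on_lines(n, {control: P0}) + on_lines(n, {control: P1, target: X})


def duality_T(n):
    """T_n = CNOT_12 CNOT_23 ... CNOT_{n-1,n} H_1 ... H_n as a matrix product."""
    T = np.eye(2**n, dtype=complex)
    for k in range(1, n):
        T = T @ cnot(n, k, k + 1)
    return T @ on_lines(n, {line: H for line in range(1, n + 1)})


def claimed_generators(n):
    """c'_{2k-1} = X_{k-1} prod_{j>=k} Z_j and c'_{2k} = -Y_k prod_{j>k} Z_j."""
    out = []
    for k in range(1, n + 1):
        odd = {j: Z for j in range(k, n + 1)}
        if k > 1:
            odd[k - 1] = X
        even = {j: Z for j in range(k + 1, n + 1)}
        even[k] = Y
        out += [on_lines(n, odd), -on_lines(n, even)]
    return out


n, k = 4, 2
T = duality_T(n)
conj = lambda A: T.conj().T @ A @ T
c_new = [conj(c) for c in jordan_wigner(n)]

generators_match = all(np.allclose(a, b) for a, b in zip(c_new, claimed_generators(n)))
z_is_quadratic = all(
    np.allclose(on_lines(n, {m: Z}), -1j * c_new[2 * m - 1] @ c_new[2 * m])
    for m in range(1, n)
)
# (Pauli term on lines k, k+1) -> sign * (conjugated term)
term_pairs = [
    ({k: Y, k + 1: Y}, -1, {k - 1: X, k: Z, k + 1: X}),
    ({k: X, k + 1: X}, +1, {k: Z}),
    ({k: Y, k + 1: X}, -1, {k - 1: X, k: Y}),
    ({k: X, k + 1: Y}, -1, {k: Y, k + 1: X}),
    ({k: Z}, +1, {k - 1: X, k: X}),
    ({k + 1: Z}, +1, {k: X, k + 1: X}),
]
terms_match = all(
    np.allclose(conj(on_lines(n, old)), sign * on_lines(n, new))
    for old, sign, new in term_pairs
)
print(generators_match, z_is_quadratic, terms_match)  # True True True
```

The check of $Z_m = -i c'_{2m} c'_{2m+1}$ runs over m = 1, ..., n-1, since $c'_{2n+1}$ does not exist.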

Example 3. For odd n consider the translationally uniform Clifford operation $T_n$ given
by

$$T_n = (\mathrm{CNOT}_{12}\,\mathrm{CNOT}_{34} \cdots \mathrm{CNOT}_{n-2,n-1})(\mathrm{CNOT}_{32}\,\mathrm{CNOT}_{54} \cdots \mathrm{CNOT}_{n,n-1})$$

as depicted in figure 2 (b). Conjugating the Jordan-Wigner representation we obtain in this
case:

$$c'_{4l} = \Big(\prod_{j=2}^{2l-1} Z_j\Big) Y_{2l} Z_{2l+1},$$
$$c'_{4l+1} = \Big(\prod_{j=2}^{2l-1} Z_j\Big) Y_{2l} Y_{2l+1} X_{2l+2},$$
$$c'_{4l+2} = \Big(\prod_{j=2}^{2l-1} Z_j\Big) Y_{2l} X_{2l+1} X_{2l+2},$$
$$c'_{4l+3} = \Big(\prod_{j=2}^{2l} Z_j\Big) X_{2l+2},$$

supplemented by boundary terms

$$c'_1 = X_1 X_2, \qquad c'_2 = Y_1 X_2, \qquad c'_3 = Z_1 X_2.$$

It follows from these expressions that the conjugations of n.n. G(A, B) gates on lines
k, k+1 become 4-local gates on lines k, k+1, k+2, k+3. By considering all possible
quadratic terms of $c_\mu$'s associated to these 4 lines we obtain for each k, a 13 parameter
family of Gaussian gates (i.e. not all $\binom{8}{2}$ quadratic terms remain 4-local under conjugation).
These are generated by the following Hamiltonians and their commutators:

for k odd: $Z_k Z_{k+1} X_{k+2} X_{k+3}$, $Z_k Z_{k+1} Z_{k+2}$, $X_{k+1} Z_{k+2} X_{k+3}$, $X_k X_{k+1}$,
$X_{k+1} X_{k+2}$, $X_{k+2} X_{k+3}$, $Z_k$, $Z_{k+2}$;

for k even: $X_k X_{k+1} Z_{k+2} Z_{k+3}$, $Z_{k+1} Z_{k+2} Z_{k+3}$, $X_k Z_{k+1} X_{k+2}$, $X_k X_{k+1}$,
$X_{k+1} X_{k+2}$, $X_{k+2} X_{k+3}$, $Z_{k+1}$, $Z_{k+3}$.

Thus when k is odd $Z_k$ is obtained as a quadratic expression in the $c'_\mu$'s, whereas when
k is even $Z_k$ requires a sixth degree Clifford algebra monomial viz. the product of
$Z_{k-1} Z_k Z_{k+1}$, $Z_{k-1}$ and $Z_{k+1}$, each of which is in the k-even list above and hence quadratically
representable. Thus arbitrary poly-sized circuits of the 26 parameter family of 4-qubit gates
defined by the Hamiltonians above are classically efficiently simulatable, albeit with a higher
simulation cost which now scales as $O(n^6)$.
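The conjugated generators of Example 3 can likewise be compared with a brute-force computation. The sketch below uses n = 7 and checks the three boundary terms together with the bulk generators for l = 1, 2, taking the products over $Z_j$ to run up to j = 2l-1 in the first three families and up to j = 2l in the last. The comparison is deliberately made only up to an overall sign of each generator, since a sign flip of a generator does not affect the relations of eq. (5). The usual Jordan-Wigner form is assumed for eq. (10) and all helper names are illustrative.

```python
from functools import reduce
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
P0, P1 = np.diag([1, 0]).astype(complex), np.diag([0, 1]).astype(complex)


def on_lines(n, ops):
    """n-qubit tensor product: ops[line] on the given 1-based lines, identity elsewhere."""
    return reduce(np.kron, [ops.get(line, I2) for line in range(1, n + 1)])


def jordan_wigner(n):
    """Return [c_1, ..., c_2n] (assumed form of eq. (10))."""
    cs = []
    for k in range(1, n + 1):
        string = {j: Z for j in range(1, k)}
        cs.append(on_lines(n, {**string, k: X}))
        cs.append(on_lines(n, {**string, k: Y}))
    return cs


def cnot(n, control, target):
    return on_lines(n, {control: P0}) + on_lines(n, {control: P1, target: X})


def example3_T(n):
    """(CNOT_12 CNOT_34 ... CNOT_{n-2,n-1})(CNOT_32 CNOT_54 ... CNOT_{n,n-1}) for odd n."""
    T = np.eye(2**n, dtype=complex)
    for c in range(1, n - 1, 2):
        T = T @ cnot(n, c, c + 1)
    for c in range(3, n + 1, 2):
        T = T @ cnot(n, c, c - 1)
    return T


def claimed(n, mu):
    """Claimed c'_mu (1-based) for the boundary terms and for bulk terms with 2l+2 <= n."""
    if mu <= 3:
        return on_lines(n, {1: (X, Y, Z)[mu - 1], 2: X})
    l, r = divmod(mu, 4)
    string = {j: Z for j in range(2, 2 * l + (1 if r == 3 else 0))}
    tail = {
        0: {2 * l: Y, 2 * l + 1: Z},
        1: {2 * l: Y, 2 * l + 1: Y, 2 * l + 2: X},
        2: {2 * l: Y, 2 * l + 1: X, 2 * l + 2: X},
        3: {2 * l + 2: X},
    }[r]
    return on_lines(n, {**string, **tail})


def equal_up_to_sign(A, B):
    return np.allclose(A, B) or np.allclose(A, -B)


n = 7
T = example3_T(n)
cs = jordan_wigner(n)
all_match = all(
    equal_up_to_sign(T.conj().T @ cs[mu - 1] @ T, claimed(n, mu))
    for mu in range(1, 12)          # c'_1 ... c'_11
)
print(all_match)
```

Generators near the right-hand end of the line (here $c'_{13}$ and $c'_{14}$) would need separate boundary expressions and are not covered by this check.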

7 Concluding remarks 

In theorems 1 and 2 we have seen that quantum computational power may be made to 
appear as a surprisingly delicate extension of its classical counterpart. Is it conceivable that 
the passage from n.n. to next-n.n. use of G(A, B) gates may be achieved while maintaining 
classical simulatability? We relate this question to some more formal complexity theoretic
considerations after introducing some further terminology. 

Recall that BQP is the class of languages decided by a uniform (poly-sized) family
of quantum circuits for which, given any input computational basis state, each output
probability $p_0$ and $p_1$ is $> \frac{2}{3}$ or $< \frac{1}{3}$ (with output 0 resp. 1 designating acceptance resp.
rejection of the input). Introduce PQP (a quantum analogue of the classical class PP, cf.
Papadimitriou 1994) to denote the corresponding class of languages for which the bounded
probability conditions are relaxed to requiring only that $p_0$ and $p_1$ are $> \frac{1}{2} + \frac{1}{2^n}$ or $< \frac{1}{2} - \frac{1}{2^n}$.
Clearly BQP ⊆ PQP but we also have NP ⊆ PQP (e.g., using a quantum algorithm for SAT
that simply computes a Boolean function on an equal superposition of all its inputs and
measures the function output register for values 0 versus 1). Similarly it is straightforward
to see that PP ⊆ PQP but furthermore it may be shown (Watrous 2008) that PP = PQP.

Now let $\mathcal{P}_{\rm n.n.}$ ⊆ PQP be the class of languages decided by PQP-circuits of n.n. G(A, B)
gates. With our strong notion of classical simulation, theorem 1 gives $\mathcal{P}_{\rm n.n.}$ ⊆ P. Also
theorem 2 shows that every language in PQP is decidable (relative to the PQP probability
conditions) by a circuit comprising only n.n. and next-n.n. G(A, B) gates (and applied to
a suitably restricted set of inputs, encoding strings of 0's and 1's). Thus if the latter were
also classically simulatable we would have P = NP = PP i.e. in the context of the PQP
probability conditions, an extra supra-classical computational power must be associated to
the single distance extension of the range of n.n. 2-qubit G(A, B) gates if these classical
computational complexity classes are to be unequal.

On the other hand the same analysis carried out relative to the (far more stringent) BQP 
probability conditions (viz. requiring $p_0$ and $p_1$ to be bounded away from $\frac{1}{2}$ by at least $\frac{1}{6}$) is
less compelling. Indeed it is generally believed (although not proven) that neither NP nor 
PP is contained in BQP so in the context of BQP circuits it becomes less implausible that 
the passage from n.n. to next n.n. G(A, B) circuits might retain classical simulatability 
(now no longer implying equality of P, NP and PP). But then we would have P = BQP. 
Actually, more simply, to obtain BPP = BQP it would suffice to simultaneously relax our 
(very strong) notion of classical simulation to a far weaker requirement viz. the ability 
to merely sample the output distribution once by classical efficient means, in contrast to 
classically efficiently computing the probabilities to exponential accuracy. 